Feat(Dir/Difference): See Difference Between Dir Files & Content #20

toksdotdev · 2018-08-07T01:09:02Z

Print out the difference between any two directories to stdout. This includes:

Difference between lines in files
Existence of a file

A sample output is:

╔════════════════════════════════════════════════════════════════════════╗
║                               Differences                              ║
╠════════════════════════╦═══════════════════════╦═══════════════════════╣
║ Filename               ║ tests/easy/bad/dir1   ║ tests/easy/bad/dir2   ║
╠════════════════════════╬═══════════════════════╬═══════════════════════╣
║ /sample.txt            ║ FILE EXISTS           ║ DOESN'T EXIST         ║
╠════════════════════════╬═══════════════════════╬═══════════════════════╣
║ "/test.txt":1          ║ testing testing       ║ oh no!                ║
╠════════════════════════╬═══════════════════════╬═══════════════════════╣
║ "/test.txt":2          ║ skdjjd                ║                       ║
╚════════════════════════╩═══════════════════════╩═══════════════════════╝

Tests created for any new feature or regression tests for bugfixes.
cargo test succeeds
cargo +nightly clippy succeeds

epage

Hey, thanks for contributing!

One question I have, though, is whether a view (e.g. a table) belongs in dir-diff, or only a model (e.g. an iterator of results).

This is related to issue #11 which I experimented with in #17. I'm mixed on the results there and haven't had time to further experiment on it.

Additionally, I've been curious about integrating dir diffing into predicates so that it can work with a testing library like assert_fs. See assert-rs/predicates-rs#33.

epage · 2018-08-07T03:10:44Z

Also, could you share what your use case is? It'd be helpful to better understand the larger picture of what you are trying to accomplish.

epage

Thanks for responding to my first round of feedback. In this round, I cover some more design questions and trade-offs. You are welcome to go ahead and work on them, and we can continue to iterate each round until we arrive at a design that works.

I will note, though, that I've known people who tend to get burnt out by that kind of iteration, and I don't want that to happen to you. Something that can help is first discussing the requirements in #11 and assert-rs/predicates-rs#33 and proposing a solution in #11. We can then iterate on the concepts in the comments of #11 before taking the time to write the code.

epage · 2018-08-18T20:28:05Z

Cargo.toml


 [dependencies]
 walkdir = "2.0.1"
+matches = "0.1.7"


Looks like this is only used for tests. This should be under dev-dependencies so that clients don't get this added to their builds.

For an example, see https://github.com/assert-rs/assert_cmd/blob/master/Cargo.toml

epage · 2018-08-18T20:44:45Z

src/lib.rs

+        .collect::<Vec<_>>();
+
+    if files_a.len() != files_b.len() {
+        return Err(Error::MissingFiles);


This is an interesting trade-off between performance and information.

For people who want fast results, this is great. For people who want to know what all is different, this doesn't work out too well.

Two things about this.

First, if we're trying to optimize, I'm unsure which half of this is slower: the part above here (walking two directory trees and putting all content into a Vec) or the part below (diffing content).

Second, I think it might be better to offer the client the choice.

We can handle both of these if we structure this as an iterator over the two directories and then offer a diff function on each "tuple". Clients can then choose whether to collect all differences, stop at the first difference, etc. I'm assuming we can build the iterator by first walking one tree, checking for what exists in the other tree, and then walking the second tree, checking for what doesn't exist in the first.

This is how cobalt's diffing works in its tests
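To make the two-pass idea above concrete, here is a minimal sketch using only the standard library. The names `Entry`, `relative_paths`, and `paired_entries` are hypothetical and are not part of dir-diff or cobalt. For simplicity it collects both trees eagerly instead of walking lazily, so it illustrates the shape of the API more than the performance characteristics.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Where a relative path was found. Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Entry {
    Both(PathBuf),
    OnlyLeft(PathBuf),
    OnlyRight(PathBuf),
}

/// Recursively collect every path under `dir`, relative to `base`.
/// Note: `is_dir` follows symlinks, so symlink cycles are not handled here.
fn relative_paths(base: &Path, dir: &Path, out: &mut BTreeSet<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let rel = path.strip_prefix(base).expect("path is under base").to_path_buf();
        out.insert(rel);
        if path.is_dir() {
            relative_paths(base, &path, out)?;
        }
    }
    Ok(())
}

/// Pair up the entries of two trees.
/// Pass 1 walks the left tree and checks what exists on the right;
/// pass 2 walks the right tree and reports what the left lacks.
pub fn paired_entries(left: &Path, right: &Path) -> io::Result<impl Iterator<Item = Entry>> {
    let mut l = BTreeSet::new();
    let mut r = BTreeSet::new();
    relative_paths(left, left, &mut l)?;
    relative_paths(right, right, &mut r)?;

    let first: Vec<Entry> = l
        .iter()
        .map(|p| {
            if r.contains(p) {
                Entry::Both(p.clone())
            } else {
                Entry::OnlyLeft(p.clone())
            }
        })
        .collect();
    let second: Vec<Entry> = r.difference(&l).cloned().map(Entry::OnlyRight).collect();

    Ok(first.into_iter().chain(second))
}
```

With an iterator like this, a caller who wants a fast answer could use `.find(|e| !matches!(e, Entry::Both(_)))` to stop at the first missing entry, while a caller who wants the full picture could `.collect()` everything and diff the contents of each `Both` pair.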

epage · 2018-08-18T20:45:29Z

src/lib.rs

+        let full_path_a = &a_base.as_ref().join(&a);
+        let full_path_b = &b_base.as_ref().join(&b);
+
+        if full_path_a.is_dir() || full_path_b.is_dir() {


But what if one is a directory and the other isn't? Won't this silently ignore that?
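One way to avoid silently skipping that case is to compare the kinds of both paths before deciding what to do. The following is only a sketch; `KindCheck` and `check_kinds` are hypothetical names and not part of this PR.

```rust
use std::path::Path;

/// Result of comparing the kinds of two paths. Hypothetical type.
#[derive(Debug, PartialEq, Eq)]
pub enum KindCheck {
    /// Both paths are directories: nothing to diff at this level.
    BothDirs,
    /// Neither path is a directory: safe to compare as files.
    /// (A path that does not exist also lands here, since `is_dir` is false.)
    BothNonDirs,
    /// One is a directory and the other is not: this is itself a difference.
    Mismatch,
}

pub fn check_kinds(a: &Path, b: &Path) -> KindCheck {
    match (a.is_dir(), b.is_dir()) {
        (true, true) => KindCheck::BothDirs,
        (false, false) => KindCheck::BothNonDirs,
        _ => KindCheck::Mismatch,
    }
}
```

The `Mismatch` arm is the case the `||` condition in the hunk above would skip over; with an explicit match it can be reported as a difference instead.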

epage · 2018-08-18T20:47:06Z

src/lib.rs

+
+    for (a, b) in files_a.into_iter().zip(files_b.into_iter()).into_iter() {
+        if a != b {
+            return Err(Error::FileNameMismatch(a, b));


While this is convenient to write, I don't think users will understand what a file name mismatch means. What it really indicates is that a file is missing from one of the two trees. Ideally, we tell them that and tell them which one is missing.
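As a rough illustration of what a more descriptive error could carry, here is a sketch of an error value that records the path and the tree it is missing from. `Side` and `MissingEntry` are hypothetical names, not types from this PR.

```rust
use std::fmt;
use std::path::PathBuf;

/// Which of the two compared trees is meant. Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Side {
    Left,
    Right,
}

/// A path that exists in one tree but not in the other. Hypothetical type.
#[derive(Debug, PartialEq, Eq)]
pub struct MissingEntry {
    /// Path relative to the root of the compared trees.
    pub path: PathBuf,
    /// The tree in which the path could not be found.
    pub missing_from: Side,
}

impl fmt::Display for MissingEntry {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let side = match self.missing_from {
            Side::Left => "left",
            Side::Right => "right",
        };
        write!(f, "`{}` is missing from the {} tree", self.path.display(), side)
    }
}

impl std::error::Error for MissingEntry {}
```

A value like `MissingEntry { path: "sample.txt".into(), missing_from: Side::Right }` then tells the user both what is missing and where, which a pair of mismatched file names does not.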

epage · 2018-08-18T20:49:11Z

src/lib.rs

+                let mut a_lines = content_of_a.lines().collect::<Vec<&str>>();
+                let mut b_lines = content_of_b.lines().collect::<Vec<&str>>();
+
+                if a_lines.len() != b_lines.len() {


Rather than writing our own file diffing, we should probably use the difference crate.

This is a re-phrasing of a previous posting that is now collapsed:

Depending on how we solve binary files, should this instead use the difference crate rather than re-implementing file diffing ourselves?

epage · 2018-08-18T20:52:31Z

tests/smoke.rs

+
+#[test]
+fn missing_file() {
+    assert_matches!(


This is neat. I've never heard of this crate before. I'll need to see where it'd simplify my tests.

epage · 2018-08-18T20:53:50Z

tests/smoke.rs

+#[test]
+fn missing_file() {
+    assert_matches!(
+        dir_diff::see_difference("tests/missing_file/dir1", "tests/missing_file/dir2"),


For every test case, we should probably ensure see_difference returns the same result as is_different.

epage · 2018-08-18T20:58:13Z

src/lib.rs

    Ok(!a_walker.next().is_none() || !b_walker.next().is_none())
 }

+/// Identify the differences between two directories.


Context: This is more for design input than something to be implemented right now.

We are only offering directory diffing. Subset checks would also be very valuable, possibly more valuable, at least when writing tests: they keep the "golden" case simple and avoid expanding a test to cover more than is intended.

Later I mention:

We can handle both of these if we structure this as an iterator over the two directories and then offer a diff function on each "tuple". Clients can then choose whether to collect all differences, stop at the first difference, etc. I'm assuming we can build the iterator by first walking one tree, checking for what exists in the other tree, and then walking the second tree, checking for what doesn't exist in the first.

That makes it really easy for us to also offer a subset check: we only do the first iteration. One more benefit of the iteration approach.
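A subset check along those lines could look roughly like the sketch below, which performs only the first walk. `missing_from_superset` is a hypothetical name. It checks only that each path exists in the other tree, not that file contents match, and it uses just the standard library.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return the relative paths under `subset` that have no counterpart under
/// `superset`. Only the `subset` tree is walked, so extra entries in
/// `superset` are ignored. Hypothetical helper, for illustration only.
pub fn missing_from_superset(subset: &Path, superset: &Path) -> io::Result<Vec<PathBuf>> {
    let mut missing = Vec::new();
    // Relative directories still to visit; the empty path is the root.
    let mut pending = vec![PathBuf::new()];

    while let Some(rel_dir) = pending.pop() {
        for entry in fs::read_dir(subset.join(&rel_dir))? {
            let entry = entry?;
            let rel = rel_dir.join(entry.file_name());
            if !superset.join(&rel).exists() {
                // Report the missing path and do not descend into it.
                missing.push(rel);
            } else if entry.file_type()?.is_dir() {
                // `file_type` does not follow symlinks, so symlinked
                // directories are not descended into.
                pending.push(rel);
            }
        }
    }

    missing.sort();
    Ok(missing)
}
```

In a test, an empty result means the "golden" directory is fully covered by the actual output, regardless of what else the output contains.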

epage · 2018-08-18T20:59:08Z

src/lib.rs

 //!
 //! assert!(dir_diff::is_different("dir/a", "dir/b").unwrap());
-//! ```
+//!


Why was the closing of the example code fence removed?

epage · 2018-08-18T20:59:53Z

src/lib.rs

    StripPrefix(std::path::StripPrefixError),
    WalkDir(walkdir::Error),
+    /// One directory has more or less files than the other.
+    MissingFiles,


Keep this for now to minimize churn as this PR evolves, but I suspect we'll want to split out the diffing error information.

toksdotdev · 2018-12-08T08:14:50Z

Sorry, I haven't had time to work on this, as I've been busy for the past few months. I'll try to address this ASAP.

Sorry for the delay.

epage · 2018-12-08T17:34:32Z

Understandable; we all have those times.

toksdotdev added 4 commits August 1, 2018 17:30

Chore(Stage): Some Improvements

f727e45

Chore(Stage): Some Improvements

47df031

Feat(Dir/Difference): See Difference Between Dir Files & Content

0ad3fe6

Chore(Docs): Add Missing Documentation

f3c03aa

epage requested changes Aug 7, 2018

toksdotdev added 3 commits August 18, 2018 11:34

Feat(SeeDifference): Return Structured Differences

2704e02

Chore(Tests): Add Tests For Structured Diffences

fde981c

Chore(Doc): Improve Documentation

aaa2cac

epage requested changes Aug 18, 2018
